Evaluating Inter-Column Logical Relationships in Synthetic Tabular Data Generation

Long, Yunbo, Xu, Liming, Brintrup, Alexandra

arXiv.org Artificial Intelligence

To evaluate the fidelity of synthetic tabular data, numerous metrics have been proposed to assess accuracy and diversity, including both low-order statistics (e.g., Density Estimation and Correlation Score (Zhang et al., 2023), Average Coverage Scores (Zein & Urvoy, 2022)) and high-order statistics (e.g., α-Precision and β-Recall (Alaa et al., 2022)). However, these metrics operate at a high level and fail to evaluate whether synthetic data preserves logical relationships, such as hierarchical or semantic dependencies between features. This highlights the need for a more fine-grained, context-aware evaluation of multivariate dependencies. To address this, we propose three evaluation metrics: Hierarchical Consistency Score (HCS), Multivariate Dependency Index (MDI), and Distributional Similarity Index (DSI). To assess the effectiveness of these metrics in quantifying inter-column relationships, we select five representative tabular data generation methods from different categories for evaluation. Their performance is measured using both existing and our proposed metrics on a real-world dataset rich in logical consistency and dependency constraints. Experimental results validate the effectiveness of our proposed metrics and reveal the limitations of existing approaches in preserving logical relationships in synthetic tabular data. Additionally, we discuss potential pathways to better capture logical constraints within joint distributions, paving the way for future advancements in synthetic tabular data generation.
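The abstract does not give the formulas for HCS, MDI, or DSI, but the idea of checking hierarchical consistency between columns can be illustrated with a minimal sketch: score the fraction of synthetic rows whose (child, parent) value pair also appears in the real data. The function name, the city/country columns, and the toy rows below are all hypothetical, not the authors' definitions.

```python
def hierarchical_consistency(real_rows, synth_rows, child, parent):
    """Share of synthetic rows whose (child, parent) value pair also
    occurs in the real data, i.e. respects the observed hierarchy."""
    valid = {(r[child], r[parent]) for r in real_rows}
    hits = sum((s[child], s[parent]) in valid for s in synth_rows)
    return hits / len(synth_rows)

real = [{"city": "Boston", "country": "USA"},
        {"city": "Lyon", "country": "France"}]
synth = [{"city": "Boston", "country": "USA"},      # consistent pair
         {"city": "Boston", "country": "France"}]   # violates the hierarchy
score = hierarchical_consistency(real, synth, "city", "country")  # 0.5
```

A metric like this is column-pair specific; the paper's point is precisely that aggregate statistics miss such violations even when marginal distributions match well.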


Language hooks: a modular framework for augmenting LLM reasoning that decouples tool usage from the model and its prompt

de Mijolla, Damien, Yang, Wen, Duckett, Philippa, Frye, Christopher, Worrall, Mark

arXiv.org Artificial Intelligence

Prompting and fine-tuning have emerged as two competing paradigms for augmenting language models with new capabilities, such as the use of tools. Prompting approaches are quick to set up but rely on providing explicit demonstrations of each tool's usage in the model's prompt, thus coupling tool use to the task at hand and limiting generalisation. Fine-tuning removes the need for task-specific demonstrations of tool usage at runtime; however, this ties new capabilities to a single model, thus making already-heavier setup costs a recurring expense. In this paper, we introduce language hooks, a novel framework for augmenting language models with new capabilities that is decoupled both from the model's task-specific prompt and from the model itself. The language hook algorithm interleaves text generation by the base model with the execution of modular programs that trigger conditionally based on the existing text and the available capabilities. Upon triggering, programs may call external tools, auxiliary language models (e.g. using tool specific prompts), and modify the existing context. We benchmark our method against state-of-the-art baselines, find that it outperforms task-aware approaches, and demonstrate its ability to generalise to novel tasks.
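The interleaving loop the abstract describes can be sketched in a few lines: generate a chunk with the base model, then check each hook's trigger against the accumulated text, running its program when it fires. The calculator hook, regex trigger, and toy model below are illustrative stand-ins, not the paper's actual implementation.

```python
import re

# A single example hook: fires when the text ends with an unanswered
# arithmetic expression, calls a "tool" (here eval), and edits the context.
PATTERN = re.compile(r"(\d+ [+\-*/] \d+) =\s*$")

def calc_trigger(text):
    return PATTERN.search(text) is not None

def calc_program(text):
    expr = PATTERN.search(text).group(1)
    return text + str(eval(expr))        # append the tool's result

HOOKS = [(calc_trigger, calc_program)]

def generate_with_hooks(model_step, prompt, max_steps=4):
    """Interleave base-model generation with conditionally triggered programs."""
    text = prompt
    for _ in range(max_steps):
        text = model_step(text)          # one chunk of base-model output
        for trigger, program in HOOKS:   # check every hook after each chunk
            if trigger(text):
                text = program(text)     # a hook may call tools / edit context
    return text

# Stand-in for the base model: emits two fixed chunks, then stops.
chunks = iter(["What is 3 + 4 = ", " Done."])
def toy_model(text):
    return text + next(chunks, "")

out = generate_with_hooks(toy_model, "", max_steps=2)
# out == "What is 3 + 4 = 7 Done."
```

Note how neither the prompt nor the model knows about the calculator: the capability lives entirely in the hook, which is the decoupling the framework targets.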


A Structural Text-Based Scaling Model for Analyzing Political Discourse

Vávra, Jan, Prostmaier, Bernd Hans-Konrad, Grün, Bettina, Hofmarcher, Paul

arXiv.org Artificial Intelligence

Estimating ideological positions of lawmakers has a long tradition in political science. Poole & Rosenthal (1985) proposed a "scaling procedure" to estimate ideological positions of lawmakers based on their voting behavior. Dynamic weighted nominal three-step estimation (McCarty et al. 1997), an extension of this procedure, results in the DW-Nominate scores that are widely accepted as benchmark ideological positions both on party level as well as on individual level (see, e.g., Poole et al. 2011, Lewis et al. 2022, Boche et al. 2018). Legislative votes, however, provide limited information on the latent ideological positions because voting behavior on individual level is often not documented and lawmakers rarely diverge from party-line voting due to robust party discipline (Hug 2010). Consequently, roll-call analysis for inferring the ideological positions adopted by legislators both within and across parties is of limited value (see, e.g., Lauderdale & Herzog 2016). Text-based scaling models are a promising alternative method to discern ideological stances based on political discussions.


Crackdown on 'deceptive' AI in political ads passes NH House without debate

FOX News

Political ads featuring deceptive synthetic media would be required to include disclosure language under a bill passed Thursday by the New Hampshire House. Sophisticated artificial intelligence tools, such as voice-cloning software and image generators, are already in use in elections in the U.S. and around the world, raising concerns about the rapid spread of misinformation.


Dean Phillips distances himself from campaign operative who reportedly paid $1 for AI-generated Biden deepfake

FOX News

Longshot Democratic presidential candidate Rep. Dean Phillips, D-Minn., is distancing himself from a report that one of his campaign's former consultants hired a magician to create a deepfake of President Biden urging New Hampshire voters not to participate in last month's primary. Paul Carpenter, a magician from New Orleans, came forward and said he had made the deepfake for $1 and that Democratic consultant Steve Kramer had paid him $150 to do it, according to an NBC report. Kramer is a get-out-the-vote specialist who worked on ballot access for the Phillips campaign and also worked on Kanye West's unsuccessful 2020 presidential campaign. "I'm disgusted that a consultant hired to assist my campaign [with] ballot access is alleged to have faked a robocall impersonating Joe Biden," Phillips wrote on X on Friday. "While I don't know the person, such behavior is despicable and I trust will be investigated by authorities. It's also despicable that the Party actively limits access to state ballots and blackballs reputable consultants who would otherwise work with challengers like me. The corruption in politics is pervasive and must be exposed and addressed."


ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models

Wadhawan, Rohan, Bansal, Hritik, Chang, Kai-Wei, Peng, Nanyun

arXiv.org Artificial Intelligence

Recent advancements in AI have led to the development of large multimodal models (LMMs) capable of processing complex tasks involving joint reasoning over text and visual content in the image (e.g., navigating maps in public places). This paper introduces ConTextual, a novel benchmark comprising instructions designed explicitly to evaluate LMMs' ability to perform context-sensitive text-rich visual reasoning. ConTextual emphasizes diverse real-world scenarios (e.g., time-reading, navigation, shopping and more) demanding a deeper understanding of the interactions between textual and visual elements. Our findings reveal a significant performance gap of 30.8% between the best-performing LMM, GPT-4V(ision), and human capabilities using human evaluation indicating substantial room for improvement in context-sensitive text-rich visual reasoning. Notably, while GPT-4V excelled in abstract categories like meme and quote interpretation, its overall performance still lagged behind humans. In addition to human evaluations, we also employed automatic evaluation metrics using GPT-4, uncovering similar trends in performance disparities. We also perform a fine-grained evaluation across diverse visual contexts and provide qualitative analysis which provides a robust framework for future advancements in the LMM design. https://con-textual.github.io/


Voice recognition: Leaked Trump tape contradicts denials on sharing Iran war plan

FOX News

Fox News senior national correspondent Kevin Corke and OutKick writer Mary Katharine Ham joined 'MediaBuzz' to discuss the former president's sit-down interview with 'Special Report' anchor Bret Baier. And few have a more recognizable voice than Donald Trump. The media have gone into high-decibel mode over an audio recording, obtained by CNN, which appears to prove that he did show a highly classified document to one or more staffers, contradicting his past denials. You may have read part of the transcript of this 2021 conversation – it's included in the indictment – but there are new details on the tape (including the sound of Trump ruffling papers) that make it more newsworthy. Former President Trump remains the frontrunner for the 2024 Republican nomination.


Controllable Text Generation with Language Constraints

Chen, Howard, Li, Huihan, Chen, Danqi, Narasimhan, Karthik

arXiv.org Artificial Intelligence

We consider the task of text generation in language models with constraints specified in natural language. To this end, we first create a challenging benchmark Cognac that provides as input to the model a topic with example text, along with a constraint on text to be avoided. Unlike prior work, our benchmark contains knowledge-intensive constraints sourced from databases like Wordnet and Wikidata, which allows for straightforward evaluation while striking a balance between broad attribute-level and narrow lexical-level controls. We find that even state-of-the-art language models like GPT-3 fail often on this task, and propose a solution to leverage a language model's own internal knowledge to guide generation. Our method, called CognacGen, first queries the language model to generate guidance terms for a specified topic or constraint, and uses the guidance to modify the model's token generation probabilities. We propose three forms of guidance (binary verifier, top-k tokens, textual example), and employ prefix-tuning approaches to distill the guidance to tackle diverse natural language constraints. Through extensive empirical evaluations, we demonstrate that CognacGen can successfully generalize to unseen instructions and outperform competitive baselines in generating constraint conforming text.
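The binary-verifier form of guidance the abstract mentions can be sketched as a filter over next-token probabilities: zero out any token the verifier rejects, then renormalize. The function name, the "avoid animals" constraint, and the toy vocabulary below are hypothetical illustrations, not CognacGen's actual code.

```python
def binary_verifier_guidance(token_probs, violates):
    """Zero the probability of any next token the verifier rejects,
    then renormalize over the remaining tokens."""
    guided = {t: (0.0 if violates(t) else p) for t, p in token_probs.items()}
    total = sum(guided.values())
    if total == 0.0:                 # every token blocked: leave unchanged
        return dict(token_probs)
    return {t: p / total for t, p in guided.items()}

# Hypothetical guidance terms generated for a constraint like "avoid animals".
banned = {"dog", "cat"}
probs = {"dog": 0.5, "tree": 0.3, "cat": 0.1, "rock": 0.1}
guided = binary_verifier_guidance(probs, lambda tok: tok in banned)
# guided["dog"] == 0.0, guided["tree"] == 0.75
```

The top-k and textual-example guidance forms would replace the hard verifier with softer reweighting, but the core mechanism of editing the token distribution at decode time is the same.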


Collect Data, Influence Votes: 'If Then' Traces The Genesis Of Data-Driven Politics

NPR Technology

A collection of current and past presidential advertising materials hangs on a wall in November in the visitor center of the New Hampshire State House in Concord, N.H. Decades before Google or Facebook existed, a Madison Avenue advertising man started a company called Simulmatics based on a then-revolutionary method of using computers to forecast how people would behave. Formed in 1959, Simulmatics charged clients a hefty fee to access its "people machine" -- a computer program that drew on polling information and behavioral science to predict mathematically the impact of an advertising pitch or political message. The New Yorker's Jill Lepore writes about Simulmatics in her new book, If Then: How the Simulmatics Corporation Invented the Future.


"Free" Tablets Are Costing Prison Inmates a Fortune

Mother Jones

Wayne Snitzky was 18 years old when he was sentenced to prison for murdering a girl four years his junior. It was 1995, beepers were the height of personal technology, and the most sophisticated video game he had played was "Leisure Suit Larry," a 2D adult-themed computer game that followed the sexual exploits of the sleazy main character. During the 23 years he has been locked up in Ohio's Marion Correctional Institute, Snitzky has been on the periphery of technology's rapid evolution. While he was able to maintain some of his computer skills through a work program with a nonprofit, where he now teaches other inmates basic tech skills such as composing emails and using a word processor, Snitzky's access to communication was limited. But in the early 2000s, inmates at Marion got their first taste of email--though a far different version than the one most users know.